69 research outputs found

    Localizome: a server for identifying transmembrane topologies and TM helices of eukaryotic proteins utilizing domain information

    The Localizome server predicts the transmembrane (TM) helix number and TM topology of a user-supplied eukaryotic protein and presents the result as an intuitive graphic representation. It utilizes hmmpfam to detect the presence of Pfam domains and a prediction algorithm, Phobius, to predict the TM helices. The results are combined and checked against the TM topology rules stored in a protein domain database called LocaloDom. LocaloDom is a curated database that contains TM topologies and TM helix numbers of known protein domains. It was constructed from Pfam domains combined with Swiss-Prot annotations and Phobius predictions. The Localizome server corrects the combined results of the user sequence to conform to the rules stored in LocaloDom. Compared with other programs, this server showed the highest accuracy for TM topology prediction: for soluble proteins, the accuracy and coverage were 99 and 75%, respectively, while for TM protein domain regions, they were 96 and 68%, respectively. With a graphical representation of TM topology and TM helix positions with the domain units, the Localizome server is a highly accurate and comprehensive information source for subcellular localization for soluble proteins as well as membrane proteins. The Localizome server can be found at

    Patome: a database server for biological sequence annotation and analysis in issued patents and published patent applications

    With the advent of automated and high-throughput techniques, the number of patent applications containing biological sequences has been increasing rapidly. However, they have attracted relatively little attention compared to other sequence resources. We have built a database server called Patome, which contains biological sequence data disclosed in patents and published applications, as well as their analysis information. The analysis is divided into two steps. The first is an annotation step in which the disclosed sequences were annotated with the RefSeq database. The second is an association step where the sequences were linked to Entrez Gene, OMIM and GO databases, and their results were saved as a gene–patent table. From the analysis, we found that 55% of human genes were associated with patenting. The gene–patent table can be used to identify whether a particular gene or disease is related to patenting. Patome is available at ; the information is updated bimonthly

    ESTpass: a web-based server for processing and annotating expressed sequence tag (EST) sequences

    We present a web-based server, called ESTpass, for processing and annotating sequence data from expressed sequence tag (EST) projects. ESTpass accepts a FASTA-formatted EST file and its quality file as inputs, and it then executes a back-end EST analysis pipeline consisting of three consecutive steps. The first is cleansing the input EST sequences. The second is clustering and assembling the cleansed EST sequences using d2_cluster and CAP3 programs and producing putative transcripts. From the CAP3 output, ESTpass detects chimeric EST sequences which are confirmed through comparison with the nr database. The last step is annotating the putative transcript sequences using RefSeq, InterPro, GO and KEGG gene databases according to user-specified options. The major advantages of ESTpass are the integration of cleansing and annotating processes, rigorous chimeric EST detection, exhaustive annotation, and email reporting to inform the user about the progress and to send the analysis results. The ESTpass results include three reports (summary, cleansing and annotation) and download function, as well as graphic statistics. They can be retrieved and downloaded using a standard web browser. The server is available at http://estpass.kobic.re.kr/

    GS2PATH: A web-based integrated analysis tool for finding functional relationships using gene ontology and biochemical pathway data

    GS2PATH is a Web-based pipeline tool to permit functional enrichment of a given gene set from prior knowledge databases, including gene ontology (GO) database and biological pathway databases. The tool also provides an estimation of gene set enrichment, in GO terms, from the databases of the KEGG and BioCarta pathways, which may allow users to compute and compare functional over-representations. This is especially useful in the perspective of biological pathways such as metabolic, signal transduction, genetic information processing, environmental information processing, cellular process, disease, and drug development. It provides relevant images of biochemical pathways with highlighting of the gene set by customized colors, which can directly assist in the visualization of functional alteration
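
The abstract does not say which statistic GS2PATH uses to estimate enrichment. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for over-representation of a gene set in a GO term or pathway), the calculation could look like the following. The function name and parameters are hypothetical and are not taken from GS2PATH.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, sample, overlap):
    """One-sided over-representation p-value, P(X >= overlap).

    population: number of genes in the background set
    annotated:  background genes annotated to the term or pathway
    sample:     size of the user's gene set
    overlap:    genes in the user's set that carry the annotation
    (Hypothetical helper; an assumed statistic, not GS2PATH's documented one.)
    """
    upper = min(annotated, sample)
    total = comb(population, sample)
    # Sum the hypergeometric probabilities for overlap, overlap + 1, ..., upper.
    tail = sum(
        comb(annotated, k) * comb(population - annotated, sample - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers: 20000 background genes, 200 in a pathway,
    # a 50-gene set of which 5 fall in the pathway.
    print(hypergeom_enrichment_p(20000, 200, 50, 5))
```

A small p-value suggests that the gene set overlaps the term more than expected by chance; in practice a multiple-testing correction would be applied across all terms tested.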

    iCSDB: an integrated database of CRISPR screens.

    High-throughput screening based on CRISPR-Cas9 libraries has become an attractive and powerful technique to identify target genes for functional studies. However, accessibility of public data is limited due to the lack of user-friendly utilities and up-to-date resources covering experiments from third parties. Here, we describe iCSDB, an integrated database of CRISPR screening experiments using human cell lines. We compiled two major sources of CRISPR-Cas9 screening: the DepMap portal and BioGRID ORCS. DepMap portal itself is an integrated database that includes three large-scale projects of CRISPR screening. We additionally aggregated CRISPR screens from BioGRID ORCS that is a collection of screening results from PubMed articles. Currently, iCSDB contains 1375 genome-wide screens across 976 human cell lines, covering 28 tissues and 70 cancer types. Importantly, the batch effects from different CRISPR libraries were removed and the screening scores were converted into a single metric to estimate the knockout efficiency. Clinical and molecular information were also integrated to help users to select cell lines of interest readily. Furthermore, we have implemented various interactive tools and viewers to facilitate users to choose, examine and compare the screen results both at the gene and guide RNA levels. iCSDB is available at https://www.kobic.re.kr/icsdb/

    Accurate quantification of transcriptome from RNA-Seq data by effective length normalization

    We propose a novel, efficient and intuitive approach of estimating mRNA abundances from the whole transcriptome shotgun sequencing (RNA-Seq) data. Our method, NEUMA (Normalization by Expected Uniquely Mappable Area), is based on effective length normalization using uniquely mappable areas of gene and mRNA isoform models. Using the known transcriptome sequence model such as RefSeq, NEUMA pre-computes the numbers of all possible gene-wise and isoform-wise informative reads: the former being sequences mapped to all mRNA isoforms of a single gene exclusively and the latter uniquely mapped to a single mRNA isoform. The results are used to estimate the effective length of genes and transcripts, taking experimental distributions of fragment size into consideration. Quantitative RT–PCR based on 27 randomly selected genes in two human cell lines and computer simulation experiments demonstrated superior accuracy of NEUMA over other recently developed methods. NEUMA covers a large proportion of genes and mRNA isoforms and offers a measure of consistency (‘consistency coefficient’) for each gene between an independently measured gene-wise level and the sum of the isoform levels. NEUMA is applicable to both paired-end and single-end RNA-Seq data. We propose that NEUMA could make a standard method in quantifying gene transcript levels from RNA-Seq data
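
The idea of isoform-wise effective length normalization can be illustrated with a toy example. The sketch below is not NEUMA's implementation: it assumes single-end reads of a fixed length, exact matching, and no fragment size distribution. The function names and the per-kilobase, per-million scaling are illustrative assumptions.

```python
from collections import defaultdict


def unique_mappable_lengths(transcripts, read_len):
    """Count, per isoform, the read start positions whose read sequence
    occurs in no other isoform (a simplified 'uniquely mappable area')."""
    owners = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            owners[seq[i:i + read_len]].add(name)
    eff = {name: 0 for name in transcripts}
    for name, seq in transcripts.items():
        for i in range(len(seq) - read_len + 1):
            if owners[seq[i:i + read_len]] == {name}:
                eff[name] += 1
    return eff


def normalized_abundance(unique_read_counts, eff_lengths, total_mapped_reads):
    """Scale unique read counts by effective length (in kb) and by
    sequencing depth (in millions of mapped reads)."""
    out = {}
    for name, count in unique_read_counts.items():
        length = eff_lengths[name]
        if length == 0:
            out[name] = None  # no uniquely mappable positions: not measurable
            continue
        out[name] = count / (length / 1e3) / (total_mapped_reads / 1e6)
    return out


if __name__ == "__main__":
    # Two toy isoforms sharing their first eight bases.
    toy = {"iso1": "AAAACCCCGGGG", "iso2": "AAAACCCCTTTT"}
    eff = unique_mappable_lengths(toy, read_len=4)
    print(eff)
    print(normalized_abundance({"iso1": 8, "iso2": 2}, eff, 1_000_000))
```

Dividing by the uniquely mappable length rather than the full transcript length avoids penalizing isoforms whose sequence is largely shared with other isoforms, which is the motivation the abstract gives for effective length normalization.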

    Safety, tolerability of ES16001, a novel varicella zoster virus reactivation inhibitor, in healthy adults

    Purpose: Herpes zoster (HZ), or shingles, is a clinical syndrome resulting from the reactivation of latent varicella zoster virus (VZV) within the sensory ganglia. We evaluated the safety and tolerability of ES16001 (ethanol extract of Elaeocarpus sylvestris var. ellipticus), a novel inhibitor of varicella zoster virus reactivation, in healthy adults. Method: Single-center, randomized, double-blind, placebo-controlled, single and multiple ascending dose (SAD and MAD, respectively) studies were conducted in 20- to 45-year-old healthy adults without chronic disease. In the SAD study (n = 32), subjects randomly received a single oral dose of 240, 480, 960, or 1440 mg ES16001 or a placebo. In the MAD study (n = 16), subjects randomly received once-daily doses of 480 or 960 mg ES16001 or a placebo for 5 days. The safety and tolerability of the drug were evaluated by monitoring participants' treatment-emergent adverse events (TEAEs), vital signs, electrocardiograms (ECGs), physical examinations, and clinical laboratory tests. Results: In the SAD study, 11 adverse reactions were seen in 5 subjects, and in the MAD study, 8 adverse reactions were seen in 6 subjects. All adverse reactions were mild, and no serious adverse reactions occurred. The most common adverse reaction was an increase in alanine aminotransferase (ALT), but all test values were in the clinically non-significant range, and their clinical significance was judged to be small because most of the test values returned to normal immediately after the end of drug administration. Conclusion: ES16001 has good safety and tolerability when administered both once and repeatedly to healthy subjects. Further research is needed to identify any possible drug-induced hepatotoxicity, which appears infrequently. Our findings provide a rationale for further clinical investigations of ES16001 for the prevention of HZ. Trial registration: CRIS, KCT0006066 (registered 7 April 2021, retrospectively registered, https://cris.nih.go.kr/cris/search/detailSearch.do/19071). This study was funded by Genencell Co. Ltd, Yongin, Korea

    VnD: a structure-centric database of disease-related SNPs and drugs

    Numerous genetic variations have been found to be related to human diseases. A significant portion of these also affect drug response by changing protein structure and function. Therefore, it is crucial to understand the trilateral relationship among genomic variations, diseases and drugs. We present the variations and drugs (VnD), a consolidated database containing information on diseases, related genes and genetic variations, protein structures and drug information. VnD was built in three steps. First, we integrated various resources systematically to deduce catalogs of disease-related genes, single nucleotide polymorphisms (SNPs), protein mutations and relevant drugs. VnD contains 137 195 disease-related gene records (13 940 distinct genes) and 16 586 genetic variation records (1790 distinct variations). Next, we carried out structure modeling and docking simulation for wild-type and mutant proteins to examine the structural and functional consequences of non-synonymous SNPs in the drug-related genes. Conformational changes in 590 wild-type and 4437 mutant proteins from drug-related genes were included in our database. Finally, we investigated the structural and biochemical properties relevant to drug binding such as the distribution of SNPs in proximal protein pockets, thermo-chemical stability, interactions with drugs and physico-chemical properties. The VnD database, available at http://vnd.kobic.re.kr:8080/VnD/ or vandd.org, would be a useful platform for researchers studying the underlying mechanism for association among genetic variations, diseases and drugs